Analysis of unsupervised dimensionality reduction techniques
Abstract
Domains such as text and images contain large amounts of redundancy and ambiguity among their attributes, which results in considerable noise (i.e., the data is high-dimensional). Retrieving data from high-dimensional datasets is a major challenge. Dimensionality reduction techniques have been a successful avenue for automatically extracting latent concepts by removing noise and reducing the complexity of processing high-dimensional data. In this paper we conduct a systematic study comparing unsupervised dimensionality reduction techniques for the text retrieval task. We analyze these techniques in terms of complexity, approximation error, and retrieval quality, with experiments on four test document collections.
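To make the notion of approximation error concrete, here is a minimal sketch. The abstract does not name the techniques it compares, so the choice of a rank-1 truncated SVD (the idea behind latent semantic indexing) is an assumption made purely for illustration, as are the toy matrix `docs` and every helper name. The error is measured as the Frobenius norm of the difference between the original document-term matrix and its low-rank reconstruction.

```python
import math

def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius(A):
    """Frobenius norm: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in A for x in row))

def top_singular_triplet(A, iters=200):
    """Power iteration on A^T A; returns (sigma, u, v) for the largest singular value."""
    At = transpose(A)
    v = [1.0] * len(A[0])
    for _ in range(iters):
        w = matvec(At, matvec(A, v))
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Av = matvec(A, v)
    sigma = math.sqrt(sum(x * x for x in Av))
    u = [x / sigma for x in Av]
    return sigma, u, v

# Hypothetical toy data: rows are documents, columns are term counts.
docs = [
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]

sigma, u, v = top_singular_triplet(docs)

# Rank-1 reconstruction sigma * u * v^T and its residual.
rank1 = [[sigma * ui * vj for vj in v] for ui in u]
residual = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(docs, rank1)]
error = frobenius(residual)

# One-dimensional coordinates of each document in the reduced space.
reduced = [sigma * ui for ui in u]

print("approximation error:", error)
print("reduced coordinates:", reduced)
```

For a rank-1 approximation obtained this way, the squared error equals the squared Frobenius norm of the matrix minus the squared top singular value, which gives a quick sanity check on the computation.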
Related articles
2D Dimensionality Reduction Methods without Loss
In this paper, several two-dimensional extensions of principal component analysis (PCA) and linear discriminant analysis (LDA) techniques have been applied in a lossless dimensionality reduction framework for face recognition. In this framework, the benefits of dimensionality reduction were used to improve the performance of its predictive model, which was a support vector machine (...
An Empirical Comparison of Dimensionality Reduction Techniques for Pattern Classi
To some extent or other, all classifiers are subject to the curse of dimensionality. Consequently, pattern classification is often preceded by finding a reduced-dimensional representation of the patterns. In this paper we empirically compare the performance of unsupervised and supervised dimensionality reduction techniques. The data set we consider is obtained by segmenting cells in cytological pr...
A Comparison of Dimensionality Reduction Techniques for Unstructured Clinical Text
Much of clinical data is free text, which is challenging to use together with machine learning, visualization tools, and clinical decision rules. In this paper, we compare supervised and unsupervised dimensionality reduction techniques, including the recently proposed sLDA and MedLDA algorithms, on clinical texts. We evaluate each dimensionality reduction method by using them as features for tw...
Study on Dimensionality Reduction Techniques and Applications
Data is not collected only for data mining. Data accumulates at an unprecedented speed. Data preprocessing is an important part of effective machine learning and data mining. Data mining is the discovery of interesting knowledge from large amounts of data; it is an integral part of KDD (Knowledge Discovery in Databases), which is the overall process of converting raw data into useful inform...
Rough Set-based Dimensionality Reduction for Supervised and Unsupervised Learning
The curse of dimensionality is a damning factor for numerous potentially powerful machine learning techniques. Widely approved and otherwise elegant methodologies used for a number of different tasks, ranging from classification to function approximation, exhibit relatively high computational complexity with respect to dimensionality. This severely limits the applicability of such techniques to r...
Journal: Comput. Sci. Inf. Syst.
Volume: 6
Publication year: 2009